Abstract

Precise control of lineage-specific gene expression in neural stem/progenitor cells is crucial for
generating the diversity of neuronal and glial cell types in the central nervous system (CNS).
The mechanism underlying such gene regulation, however, is not fully elucidated. Here, we report
that a 377 bp evolutionarily conserved DNA fragment (CR5), located approximately 32 kbp
upstream of the Olig2 transcription start site, acts as a cis-regulator of gene expression during the
development of the neonatal forebrain. CR5 is active in a time-specific and brain region-restricted
manner. CR5 activity is not detected at embryonic stages, but is found exclusively in a subset of
Sox5+ cells in the neonatal ventral forebrain. Furthermore, we show that the Sox5 binding motif in
CR5 is important for this cell-specific gene regulatory activity; mutation of the Sox5 binding motif in
CR5 alters the cellular composition of the reporter-expressing population. Together, our study
provides new insights into the regulation of cell-specific gene expression during CNS 
development.

Introduction

During the development of the central nervous system (CNS), the cellular diversity emerges 
largely from controlled spatiotemporal segregation of cell type-specific molecular regulators 
(Butt et al., 2005; Fode et al., 2000; Hoshino, 2012; Lo et al., 2002; Molyneaux et al., 2007; 
Parras et al., 2002). A large number of different neurons and glial cells are derived from a
population of self-renewing stem and progenitor cells. Well-orchestrated lineage-specific
gene expression in the neural stem/progenitor cells is crucial for the generation of these
neuronal and glial cell types (Dessaud et al., 2008; Panman et al., 2011). However, it is not
clear how the differentiation of neural progenitors and the acquisition of their cell fate
are programmed.

Different types of neurons and glial cells in the brain originate from separate progenitor 
pools in distinct areas (Marin and Rubenstein, 2001). Many transcription factors are 
important regulators during this neural differentiation process. One group of such
transcription factors is the basic helix-loop-helix (bHLH) family, whose members are
involved in the determination of neural cell fates (Akagi et al., 2004; Bertrand et al., 2002).
Oligodendrocyte transcription factor (Olig) is a family of bHLH proteins that has received
great attention for its essential role in neural cell specification and differentiation (Lu et al.,
2001; Lu et al., 2002; Yu et al., 2013; Zhou and Anderson, 2002). The expression of the
Olig gene family is predominantly restricted to the CNS (Lu et al., 2000; Zhou et al., 2000).
Olig2, a member of the Olig gene family, is required for the formation of oligodendrocyte and
motoneuron progenitors; Olig2 null mouse embryos do not form oligodendrocytes and die at
birth (Lu et al., 2002; Zhou and Anderson, 2002). Although the role of Olig2 in the
development of the CNS has been well established, little is known about the molecular
mechanism that underlies spatiotemporal Olig2 expression during development. Other factors
involved in neurogenesis during CNS development include the Sox family transcription
factors (Azim et al., 2009; Wegner and Stolt, 2005). Sox5 is a member of the Sox D group 
widely expressed in the developing forebrain and involved in the formation of the cephalic 
neural crest, in the control of cell cycle progression in neural progenitors, and of the 
sequential generation of distinct corticofugal neuron subtypes (Lai et al., 2008; Martinez- 
Morales et al., 2010; Perez-Alcala et al., 2004). 

Neurogenesis is a developmental process highly conserved across a wide range of species 
(Finlay and Darlington, 1995; Gomez-Skarmeta et al., 2006). The evolutionarily conserved 
non-coding component of the genome is known to play an essential role in regulating this 
developmental process and has been receiving increased attention because of its predicted 
function in regulation of transcription, DNA replication, chromosome pairing, and 
chromosome condensation (Blackwood and Kadonaga, 1998; Jeziorska et al., 2009; Long 
and Miano, 2007). DNA sequences involved in gene regulation through the binding of
transcription factors, termed cis-regulatory elements, enhance or suppress gene
expression in a spatiotemporal manner. Their regulatory function is independent of orientation
or position relative to the transcription start site (Blackwood and Kadonaga, 1998; Jeziorska et
al., 2009). Several non-coding DNA fragments have been demonstrated to influence the
expression of the Olig1 and/or Olig2 genes (Friedli et al., 2010; Sun et al., 2006). However,
the trans-acting factors that activate these DNA fragments still need to be identified. 

In a search for distant cis-elements of the Olig2 gene, we identified a highly evolutionarily
conserved non-coding DNA element (CR5) upstream of the Olig2 gene transcription start
site. CR5 plays essential roles in regulating gene expression in a subpopulation of Sox5+
cells that are exclusively located in the ventral forebrain during neonatal CNS development.
We present evidence that the binding motif for Sox5 is important for the regulatory activity
of this cis-element. Our findings may provide new insights into the molecular mechanism
underlying cell-specific gene expression.

Material and methods 

Sequence alignment analysis 

The sequence and annotation of the mouse Olig2 gene along with its homologs from the
human, rat, cow and zebrafish genomes were retrieved using NCSRS (Doh et al., 2007). The
sequences were analyzed by VISTA (Frazer et al., 2004) to identify highly conserved
regions (CR). The percent identity and the length of the conserved sequence were used to
calculate a score for each conserved region (score = percent identity + (length/60)). A limit of
2 kb in sequence length was implemented in order to isolate individual cis-elements for this
study. In this scoring system, the percent identity is weighted more heavily to
ensure that shorter, highly conserved sequences are not ranked below longer sequences
with lower levels of conservation (Fig. S1).
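
The scoring rule above can be illustrated with a short Python sketch. The formula and the 2 kb length limit are taken from the text; the class name, function names, and the three example regions are hypothetical and do not come from the VISTA output used in the study.

```python
from dataclasses import dataclass

MAX_LENGTH_BP = 2000  # 2 kb length limit stated in the text


@dataclass(frozen=True)
class ConservedRegion:
    name: str                # hypothetical label
    percent_identity: float  # 0-100, as reported by the alignment tool
    length_bp: int           # length of the conserved sequence in bp


def conservation_score(region: ConservedRegion) -> float:
    """score = percent identity + (length / 60), as given in the text."""
    return region.percent_identity + region.length_bp / 60


def rank_regions(regions):
    """Drop regions above the length limit, then sort by descending score."""
    eligible = [r for r in regions if r.length_bp <= MAX_LENGTH_BP]
    return sorted(eligible, key=conservation_score, reverse=True)


# Hypothetical example values, for illustration only.
regions = [
    ConservedRegion("short_high", percent_identity=95.0, length_bp=300),
    ConservedRegion("long_low", percent_identity=75.0, length_bp=1200),
    ConservedRegion("too_long", percent_identity=90.0, length_bp=2400),
]
ranked = rank_regions(regions)
for region in ranked:
    print(region.name, conservation_score(region))
```

In this toy example the short, highly conserved region outranks the longer, less conserved one, which is the behavior the weighting is meant to guarantee.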

Reporter plasmid constructions 

Computationally predicted conserved regions were amplified using the Taq PCR Kit (New
England Biolabs, MA) following the routine Taq PCR reaction protocol. The primers used are
summarized in Table 1. Mouse genomic DNA (Swiss Webster strain) was extracted from an
adult mouse tail and used as the PCR template for all primers. A random extension sequence
(CGATATAT) and the SpeI recognition sequence (ACTAGT) were added to the 5' end of the
forward primer, and a random extension sequence and the FseI recognition sequence
(GGCCGGCC) were added to the 5' end of the reverse primer. The inserts
were then digested, gel purified, and ligated into the βGP-GFP backbone, which had been linearized
with FseI and SpeI, to generate experimental constructs (Fig. S1).
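
As a rough illustration of the primer design described above, the following Python sketch prepends an extension sequence and a restriction site to a gene-specific primer. The extension CGATATAT and the two recognition sequences are from the text; the gene-specific sequences and the reverse-primer extension are hypothetical placeholders (the actual primers are listed in Table 1).

```python
SPEI_SITE = "ACTAGT"            # SpeI recognition sequence (from the text)
FSEI_SITE = "GGCCGGCC"          # FseI recognition sequence (from the text)
FORWARD_EXTENSION = "CGATATAT"  # extension on the forward primer (from the text)

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def add_cloning_tail(extension: str, site: str, gene_specific: str) -> str:
    """Build a primer as 5'-extension + restriction site + gene-specific part-3'."""
    return extension + site + gene_specific.upper()


# Hypothetical gene-specific sequences and reverse extension, for illustration only.
forward_primer = add_cloning_tail(FORWARD_EXTENSION, SPEI_SITE, "ccttgagagtcc")
reverse_primer = add_cloning_tail("CGATATAT", FSEI_SITE, "ggtcactgaagg")

print(forward_primer)
print(reverse_primer)
# Both recognition sequences are palindromic, so they read the same on either strand.
print(reverse_complement(SPEI_SITE) == SPEI_SITE)
print(reverse_complement(FSEI_SITE) == FSEI_SITE)
```

Using two different enzymes at the two ends fixes the orientation of the insert in the linearized backbone.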

Animals and ethics statement 

For in vivo and in utero electroporation experiments, Swiss Webster mice were purchased
from Charles River Laboratories (Wilmington, MA) and maintained on a 12 h/12 h (7:00
a.m. to 7:00 p.m.) light/dark schedule from the time of arrival until the time of the
experiment. Pregnancies were timed from the day on which a vaginal plug was detected,
which was designated as embryonic day 0 (E0). By this convention, birth would normally
occur on E19. This strain was also used as the recipient for implanting 0.5 dpc (days post coitum)
embryos for transgenic animal studies. Mice were randomly assigned to distinct
experimental groups. All studies were conducted in accordance with the NIH guidelines for
the care and use of animals, with an approved animal protocol from the Institutional Animal
Care and Use Committee at Rutgers University.

In vivo electroporation 

Individual experimental plasmid DNA constructs (2-3 µg/µl) were mixed with the control
plasmid (2-3 µg/µl) to make the working DNA mixture. 1 µl of the DNA mixture was delivered
into the mouse brain at postnatal day 0 (P0), targeting the SVZ progenitors (Fig. S1), with a
Hamilton syringe. Five square pulses (80 V) of 50 ms duration with 950 ms intervals were
then applied using a pulse generator ECM 830 (BTX Harvard Apparatus).

In utero electroporation

Timed pregnant Swiss Webster female mice (Charles River Labs) were anesthetized by 
intraperitoneal delivery of 0.7-0.9 ml of 2.5% avertin. The abdomen was opened to expose 
the uterine horns. The DNA solution (1 µg/µl experimental plasmid DNA + 0.025% fast
green) was injected into the lateral ventricle of embryonic brains at E15.5 using a pulled
glass micropipette. After injection, the head of each embryo was placed between tweezer-type
electrodes (BTX Harvard Apparatus) and five square electric pulses (37 V, 50 ms) were
delivered with 950 ms intervals using a pulse generator ECM 830 (BTX Harvard
Apparatus). The wall and skin of the abdominal cavity were then sutured and closed. 

Generation of transgenic mice 

Digested DNA (CR5-GFP) was gel purified using Seakem GTG agarose gel. Purified DNA 
(3-5 pg) was introduced by microinjection into 0.5 dpc (days post coitum) fertilized F1
(C57BL/6J × CBA, Jackson Labs) mouse embryos, which were transferred to pseudopregnant
recipient females. Reimplanted embryos were allowed to develop in utero until the time point
at which the recipient females were sacrificed, or the females were allowed to give birth. Skin or tail DNA was prepared
following standard protocols for genotyping. Transmission of the transgene to subsequent
generations was verified by Southern blotting and/or PCR genotyping (forward primer:
GCA ACG TGC TGG TTA TTG TGC TGT; reverse primer: GTG GTA TTT GTG AGC 
CAG GGC ATT). 

Tissue harvesting, processing and immunohistochemistry 

Tissues from mouse brain were harvested at various embryonic and postnatal stages, fixed in
4% paraformaldehyde overnight, and washed in PBS 3 times for 5 min at 4 °C. Tissues were
cryoprotected in 30% sucrose at 4 °C overnight, until they sank in the solution;
they were then embedded in OCT, sectioned at 10-15 µm thickness using a cryostat (Thermo
0620E), mounted on Superfrost slides (Fisher Scientific), and air-dried for 30 min.
Immunostaining was performed using a Shandon Slide Rack (Thermo Scientific, MA) as 
previously described (Doh et al., 2010). Sections were incubated in blocking solution 
(0.05% Triton X-100, 10% goat serum, 3% BSA in PBS) for 30 min at room temperature 
followed by an overnight incubation with primary antibodies. GFP signal was retrieved by
staining with anti-GFP (1:1000 dilution, Invitrogen; 1:500 dilution, Abcam). Other primary
antibodies included anti-NeuN (1:1000 dilution, Millipore), Sox5 (1:200 dilution, Santa
Cruz), NG2 (1:200 dilution, Millipore), Tbr1 (1:200 dilution, Santa Cruz), Tbr2 (1:200
dilution, Abcam), Gsx1 (1:200 dilution, Santa Cruz), PDGFRα (1:1000 dilution, Abcam),
GFAP (1:1000 dilution, Sigma), BLBP (1:1000 dilution, Chemicon), Pax6 (1:200 dilution,
Millipore), Mash1 (1:100 dilution, BD Biosciences), S100β (1:1000 dilution, Sigma), PH3
(1:100 dilution, Abcam), and Ki67 (1:100 dilution, BD Pharmingen). Staining with anti-Olig2
antibody (a gift from Dr. Charles Stiles at Harvard University) required pre-heating of
slides with 1 mM Tris-EDTA buffer (pH 8.5) at 96 °C for 12 min to retrieve the antigen.
Slides were then washed with PBS. Subsequently, tissue sections were incubated with
appropriate secondary antibodies conjugated to different fluorophores (donkey anti-Rb IgG
Alexa 488 or donkey anti-Gt IgG Alexa 488, 1:300, Jackson ImmunoResearch Labs;
donkey anti-m IgG Alexa 647 or donkey anti-Rb IgG Alexa 647, 1:150, Jackson
ImmunoResearch Labs). Secondary antibodies were prepared in blocking buffer and applied
at room temperature for 1 h, followed by three 10 min washes with PBS and a 5 min rinse in
distilled water to remove salt crystals. After air-drying for 5 min, slides were mounted with
40 µl of mounting medium with DAPI (Vector Laboratories).

Cell counting and statistical analysis 

For counting double-labeled cells, confocal images were captured using a digital Zeiss
AxioCam MR camera on a Zeiss Axio Imager Z1 with the ApoTome application (optical
sectioning using structured illumination), and analyzed to detect GFP+ cells and cells
stained with a specific marker. The numbers of GFP+ cells and of GFP+ cells co-stained with a
cell type-specific marker were counted manually in 4-5 sections from at least 3 animals per
time point. In addition, DAPI staining was used to ensure that GFP+ cells were co-labeled
with a cell marker (Fig. S2). The percentage of co-labeled cells over the total number of GFP+
cells was determined. Results are presented as mean ± SD. Statistical significance was
determined using Student's t-test at the level of p < 0.01.
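
The quantification described above (percentage of co-labeled cells, mean ± SD, and a two-sample Student's t-test) can be sketched in Python as follows. This is a minimal illustration using only the standard library; the per-animal counts are hypothetical, and the function names are not from the original analysis. The sketch assumes the classic equal-variance form of the t-test.

```python
import math
from statistics import mean, stdev


def colabel_percentages(counts):
    """counts: one (total GFP+ cells, GFP+ cells co-labeled with marker) pair per animal."""
    return [100.0 * double_labeled / total for total, double_labeled in counts]


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic (equal variances assumed) and degrees of freedom."""
    na, nb = len(a), len(b)
    dof = na + nb - 2
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / dof
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, dof


# Hypothetical counts for three animals per group, for illustration only.
experimental = colabel_percentages([(100, 50), (100, 52), (100, 54)])
control = colabel_percentages([(100, 22), (100, 24), (100, 26)])

t_stat, dof = pooled_t_statistic(experimental, control)
print(f"experimental: {mean(experimental):.1f} ± {stdev(experimental):.1f} %")
print(f"control:      {mean(control):.1f} ± {stdev(control):.1f} %")
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The p-value would then be read from a t distribution with the reported degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind` computes the statistic and p-value in one call.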

Electrophoretic mobility shift assay (EMSA)

EMSA was performed to identify sub-regions of CR5 with potential binding activity for
transcription factors. MatInspector, an online search tool from Genomatix (Germany) that
predicts potential trans-acting factor binding sites in nucleotide sequences (Cartharius et al.,
2005; Quandt et al., 1995; Werner, 2000), was used to identify known sequence-specific
binding sites for protein factors. Double-stranded DNA probes (40-80 bp in sequence
length) were designed to span the entire conserved region (Table 2). Probes were
synthesized (IDT, Piscataway, NJ) as single-stranded oligonucleotides, biotinylated using
the Biotin 3' End DNA Labeling Kit (Thermo Fisher Scientific) and annealed at room
temperature for one hour immediately prior to the binding assay. Unlabeled single-stranded probes
were annealed and used as double-stranded competition probes. A 40:1 ratio of
competition probe to labeled probe was used. Nuclear extracts were prepared from the brain tissues
(the VZ and SVZ of the cerebral cortex, the striatum and the ventral forebrain) of the Swiss 
Webster mice at various stages. The EMSA binding reaction and competition reaction were 
performed according to the LightShift Chemiluminescent EMSA Kit (Thermo Fisher 
Scientific) protocol. The reaction mixture was loaded onto a 10% non-denaturing
polyacrylamide gel containing 0.5× TBE (40 mM Tris, 40 mM borate, 1 mM EDTA). Mini
(8 × 8 × 0.1 cm) gels were run at 100 V for 220 min at 4 °C and dried under vacuum.
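
To illustrate how a set of probes can span an entire region, the sketch below lays out evenly spaced, overlapping windows over a region of a given length. The 377 bp region length and the number of probes (seven) are from the text; the uniform 60 bp probe length and the even spacing are assumptions made only for this example, since the actual probe sequences are listed in Table 2.

```python
def tile_probes(region_length: int, n_probes: int, probe_length: int):
    """Return (start, end) half-open windows that cover the region end to end.

    Windows are spaced evenly, so neighbouring probes overlap whenever
    n_probes * probe_length exceeds region_length.
    """
    if n_probes < 2:
        raise ValueError("need at least two probes")
    if probe_length > region_length:
        raise ValueError("probe is longer than the region")
    if n_probes * probe_length < region_length:
        raise ValueError("probes are too few or too short to cover the region")
    last_start = region_length - probe_length
    starts = [(i * last_start) // (n_probes - 1) for i in range(n_probes)]
    return [(start, start + probe_length) for start in starts]


# Assumed layout: seven 60 bp probes over a 377 bp region.
probes = tile_probes(region_length=377, n_probes=7, probe_length=60)
for number, (start, end) in enumerate(probes, start=1):
    print(f"probe #{number}: {start}-{end}")
```

Overlap between neighbouring windows avoids splitting a binding site across two probes, which is one reason a tiling design might use more probe sequence than the region length strictly requires.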

Chromatin immunoprecipitation assay (ChIP) 

ChIP assays were performed to determine which transcription factors bind to the verified
regulatory regions of CR5 in embryonic mouse tissue using a commercially available kit 
(MAGnify™ Chromatin Immunoprecipitation System, Invitrogen). The brain tissues (the VZ 
and SVZ of the cerebral cortex, the striatum and the ventral forebrain) were harvested from 
Swiss Webster mice at various stages. The brain tissue was homogenized through pipetting. 
Dissociated cells were cross-linked for 15 min at 37 °C with 1% formaldehyde. The cells 
were incubated in lysis buffer containing protease inhibitor for 5 min on ice. Sonication for 
4-6 cycles of 30 s 'ON' and 30 s 'OFF' yielded 500-1000 bp fragments of sheared
chromatin. The remaining steps were performed following the manufacturer's instructions.
Antibodies for immunoprecipitation included anti-Phox2a, Phox2b and Sox5 (Santa Cruz
Biotechnology). ChIP DNA from individual experiments was amplified by PCR using the
primers listed in Table 1.

Results 

CR5 is active in the neural progenitor cells of the neonatal forebrain

To identify cis-elements that regulate cell-specific gene expression, we performed
comparative sequence analysis. Twelve conserved regions (CR1-CR12) surrounding the
Olig2 gene locus were identified (Fig. S1A); their ability to direct tissue/cell-specific gene
expression during mouse CNS development was screened by reporter assays using in vivo
electroporation (IVE). The experimental DNA construct (containing a conserved region and
GFP as a reporter) was co-injected with a transfection control (CAG-DsRed) into the
developing mouse forebrain at postnatal day 0 (P0) and electroporated to transfect the neural
progenitor cells in the ventricular zone (VZ) and/or the subventricular zone (SVZ) (Fig. S1B
and C). Reporter GFP expression was examined in transfected tissues at various stages
of CNS development. Negative control experiments were performed with DNA
constructs containing a minimal βGP alone without a conserved element, or with a random
DNA sequence of comparable size, to ensure that GFP reporter expression was solely due to
the activity of a CR. With these controls, no GFP expression was detected in transfected tissues at any examined
stage, indicating that the minimal βGP alone or a random sequence could not direct GFP
expression. Five of the 12 CRs (CR1, 3, 4, 5 and 8; see Table 1) showed gene regulatory
activity, and CR5 was the strongest. Thus, its gene regulatory activity was further
characterized. CR5 (Chr. 16:91082197-91082573) is a 377 bp non-coding DNA fragment
located ~32 kbp upstream of the Olig2 transcription start site. It is 100% conserved among
various mouse strains, including C57BL/6, CBA, and Swiss Webster.
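
As a quick consistency check on the coordinates quoted above, the length of a genomic interval with inclusive, 1-based endpoints is end minus start plus one. The sketch below assumes the interval Chr. 16:91082197-91082573 is given in that convention; the function name is illustrative only.

```python
def interval_length(start: int, end: int) -> int:
    """Length in bp of an interval whose start and end are both inclusive."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates of CR5 as quoted in the text (inclusive endpoints assumed).
cr5_length = interval_length(91082197, 91082573)
print(cr5_length)
```

Under this assumption the interval spans 377 bp, matching the fragment length stated in the text.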

We further examined the transfection-derived GFP+ cells at P7 after in vivo electroporation
at P0. GFP+ cells were detected in the SVZ and the corpus callosum of the postnatal
forebrain (Fig. 1A-C). The experimental CR5-GFP+ cells (Fig. 1A) and the control CAG-GFP+
cells (Fig. 1B) had dramatically different morphologies in the P7 forebrain. For
example, the control CAG-GFP+ population contained morphologically diverse cells
resembling astrocytes, oligodendrocytes, and radial glia (Fig. 1B,B1), whereas the majority of
CR5-GFP+ cells had a radial glia or an undifferentiated cell morphology (Fig. 1A,A1).

Since CR5 possesses gene regulatory activity at neonatal stages, its activity during embryonic
development was then examined using in utero electroporation (IUE). The lateral ventricles
at embryonic day 15.5 (E15.5) were injected and electroporated with the experimental
construct (CR5-GFP) to transfect the neural progenitors lining the VZ/SVZ. No CR5-GFP
expression was detected in the transfected tissues at E17.5, two days after IUE (Fig. 1D-F),
indicating that CR5 may not be active at this embryonic stage. To rule out the possibility
that the lack of CR5-GFP expression was caused by a failure of electroporation, a
control construct, CAG-DsRed, was co-electroporated with the CR5-GFP construct. Strongly
labeled CAG-DsRed+ cells (red) were observed in all transfected brain tissues (Fig. 1D), confirming
that CR5 activity was not detected at the embryonic stage.

CR5 activity is preferentially in the Sox5+/NG2+ progenitors of the neonatal forebrain

We next determined the cellular identity of transfection-derived CR5-GFP+ cells by
immunostaining with the neural stem/progenitor markers Sox5 (Fig. 2A and F), NG2 (Fig. 2B
and G), Olig2 (Fig. 2C and H), BLBP (Fig. 2D and I), Mash1 (Fig. 2E and J), and GFAP (Fig.
S3 A and B); the astrocyte marker S100β (Fig. S3 C and D); and the neuronal marker NeuN (Fig.
S3 E and F). Compared with the control CAG-GFP+ cells, a significantly higher percentage
of CR5-GFP+ cells were co-labeled with Sox5 (52.6% vs 23.1%) and NG2 (68.9% vs
27.3%) (Fig. 2K). No CR5-GFP+ cells were co-labeled with NeuN (Fig. S3E and
F and Fig. 2K), indicating that CR5 is not active in differentiated neurons. The fact that
the majority of the CR5-GFP+ cells were co-labeled with Sox5 and NG2 indicates that CR5
was preferentially active in Sox5+/NG2+ progenitors during neonatal forebrain
development. To our surprise, only 37.8% of CR5-GFP+ cells were co-labeled with Olig2
(Fig. 2C, H and K), suggesting that CR5 may not be a cis-element/enhancer for Olig2
expression (Fig. 2C and H and Table S1).

CR5 activity is exclusively in a subset of Sox5+ cells in the neonatal ventral forebrain 

The electroporation targets only a limited regional cell population and results in a transient 
episomal gene expression. To determine the spatiotemporal gene regulatory activity of CR5, 
we thus generated transgenic mice containing the CR5-GFP construct. Expression of the CR5-GFP
transgene was examined at various embryonic (E11.5, E13.5, E15.5, E17.5, E19.5)
and postnatal stages (P0, P7, P14, and P21). Consistent with results obtained from the in
vivo and in utero electroporation, CR5-GFP+ cells in transgenic mice were detected at the neonatal
stage, E19.5 (n=2)/P0 (n=9); interestingly, they were predominantly located in the
ventral forebrain (Fig. 3A). Even with extensive examination, no GFP expression was
detected at the earlier embryonic stages or the other postnatal stages we examined (data not shown).
Thus, CR5 activity is exclusive to the neonatal ventral forebrain.

We then determined the cellular identities of transgenic CR5-GFP+ cells by immunostaining
sagittal sections of P0 samples. We found that the vast majority of CR5-GFP+ cells were
co-labeled with Sox5 (82.6%, n=3; p < 0.01) (Fig. 3B and G). Although some CR5-GFP+
cells were also co-labeled with other progenitor markers, the percentages of co-labeled cells were
relatively low, e.g., NG2 (19.8%, n=3; Fig. 3B and C), Gsx1 (28.7%, n=3; Fig. 3B and F),
and the mitotic cell marker PH3 (37%, n=3; Fig. 3B and G). Thus, CR5 activity is
predominantly in Sox5+ cells.

Sox5 is known to be expressed in neural progenitors and to control neocortical neuron
differentiation (Azim et al., 2009; Greig et al., 2013; Perez-Alcala et al., 2004). However, its
role in ventral forebrain development is not clear. We therefore determined the pattern of
Sox5 expression by immunohistochemistry. Peak Sox5 expression was detected in both
the dorsal and ventral forebrain at P0 (Fig. 4A-C). In the ventral forebrain, Sox5 expression was not detected at
E17.5 (Fig. 4D and E) and was dramatically reduced by P7 (Fig. 4D and F).
This ventral forebrain expression pattern of Sox5 correlates well with CR5
activity in this brain region.



Dev Biol. Author manuscript; available in PMC 2014 September 19. 



Hao et al. 



Similar to the electroporation results, a small percentage of CR5-GFP+ cells were co-labeled
with the oligodendrocyte-lineage markers Olig2 (2%) and PDGFRα (1.5%), the astrocyte marker
S100β (3.2%), the radial glia markers GFAP (4.5%) and BLBP (1.6%), and the intermediate progenitor
markers Tbr1 (2.2%), Tbr2 (3.6%), and Pax6 (3.3%) (Fig. 3 and Fig. S4). These results
suggest that CR5 activity is time-specific and brain region-restricted, and is highly induced in a
subset of Sox5+ progenitor cells in the ventral forebrain.

Specific nuclear factors bind to CR5 

Activation of CR5 requires the binding of transcription factor(s). To determine which 
specific nuclear factors may bind to CR5, we performed the electrophoretic mobility shift 
assays (EMSA). A total of 7 probes were designed to span the whole length of CR5 (Table 
2). EMSA results showed that two sub-regions within CR5 corresponding to probe #1 (Fig. 
5A and B) and probe #2 (Fig. 5C and D) have specific nuclear protein-binding activity. We 
analyzed the transcription factor binding sites within these two sub-regions using
MatInspector (Genomatix, Germany) (Cartharius et al., 2005; Quandt et al., 1995) to
identify candidate binding sites for transcription factors (the last column in Table 2).
After a further literature search, we narrowed down the candidate transcription
factors to Sox5, Phox2a and Phox2b. To determine whether these three factors
bind to CR5 in vivo, we performed chromatin immunoprecipitation (ChIP) assays
using chromatin obtained from brain tissues at various developmental stages (P0, P7, P14,
P21 and adult), immunoprecipitated with antibodies against each of the three protein factors
individually. Results from the ChIP assays showed that Sox5 and Phox2a (but not Phox2b)
bound to CR5 (Fig. 5E), suggesting that binding of Sox5 and Phox2a is important for the
activation of CR5 and its gene regulatory activity.

Lack of Sox5 binding site in CR5 alters gene regulatory activity

Since the majority of CR5-GFP+ cells were co-labeled with Sox5 (Fig. 3B and G) and Sox5 is
known to control cell cycle progression in neural progenitors and the production of distinct
neuronal cell types (Martinez-Morales et al., 2010; Perez-Alcala et al., 2004), we
focused our further investigation on the role of Sox5 in CR5 activation. We next determined
whether the Sox5 binding site is required for gene regulatory activity using a site-directed
mutagenesis assay. The mutant CR5ΔSox5-GFP construct (deletion of AT from CAAT) (Fig. 6A
and B) was injected into the P0 mouse brain (for all data points, n > 3), followed by
electroporation to transfect the neural progenitors in the SVZ. To our surprise, GFP+ cells
were observed with the mutant construct CR5ΔSox5-GFP and were comparable to the wild-type
CR5-GFP+ cells (Fig. 6A'), with no obvious difference (Fig. 6B'), suggesting that the Sox5 binding
site is not essential for CR5 activation. However, after further analysis of the molecular
identities of the resulting GFP+ cells, we found that CR5ΔSox5-GFP+ cells and CR5-GFP+
cells have different distributions across cell types (Fig. 6G and Table S2). Compared with the CR5-GFP+
cell population, a significantly lower percentage of CR5ΔSox5-GFP+ cells were co-labeled
with Sox5 (52.6% vs 16.98%; Fig. 6C), NG2 (68.9% vs 23.98%; Fig. 6D) and Olig2 (36.6%
vs 19.98%; Fig. 6E), while a higher percentage of CR5ΔSox5-GFP+ cells were co-labeled
with BLBP (35.2% vs 58.58%; Fig. 6F). These findings suggest that although the Sox5 binding
site is not required for the activation of CR5, it affects the cellular specificity of CR5 in its
gene regulatory activity.
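
The binding-site mutation described above (deletion of AT from a CAAT core) can be mimicked in silico with the following Python sketch. The example sequence is hypothetical and is not the CR5 sequence; the function name and its defaults are chosen only for illustration.

```python
def delete_from_motif(sequence: str, motif: str = "CAAT", deleted: str = "AT") -> str:
    """Remove the trailing `deleted` bases from the first occurrence of `motif`."""
    if not motif.endswith(deleted):
        raise ValueError("the deleted bases must be the end of the motif")
    position = sequence.find(motif)
    if position == -1:
        raise ValueError("motif not found in sequence")
    kept = motif[: len(motif) - len(deleted)]
    return sequence[:position] + kept + sequence[position + len(motif):]


# Hypothetical sequence containing one CAAT core, for illustration only.
wild_type = "GGACAATGCC"
mutant = delete_from_motif(wild_type)
print(wild_type, "->", mutant)
```

A two-base deletion of this kind shortens the fragment by only 2 bp while destroying the core of the predicted site, so the rest of the element stays intact for comparison with the wild-type construct.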

Discussion

The molecular mechanism underlying lineage-specific gene expression is key to
understanding how cell diversity is generated during the development of the CNS. In this
study, we report the identification of a novel cis-element, CR5, an evolutionarily conserved
non-coding DNA fragment located upstream of the Olig2 locus, capable of regulating gene
expression specifically in a subpopulation of Sox5+ progenitor cells in the ventral forebrain
during neonatal development. We demonstrated that the Sox5 binding site in CR5 is important
for cell-specific gene regulatory activity.

The VZ/SVZ progenitors in the embryonic and postnatal mammalian forebrain are known to 
generate olfactory interneurons, astrocytes, and oligodendrocytes (Marshall et al., 2005; 
Marshall et al., 2003; Menn et al., 2006). The SVZ population also contains radial glia that
serve as neural progenitors (Middeldorp et al., 2010). Consistent with previous
observations, our electroporation experiments showed that CAG-driven reporter GFP or
DsRed expression was detected in all of the above-mentioned cell types at both embryonic and
neonatal stages, while CR5-driven GFP expression was detected only in the SVZ between
P0 and P7, but not at embryonic stages (Figs. 1 and 2 and Fig. S3). Together with
data from the transgenic mice (Fig. 3), we demonstrated that CR5 activity exists only in the
neonatal developing brain.

In both the in vivo electroporation and transgenic mouse studies, the majority of CR5-GFP+
cells were co-localized with Sox5 (82.6% in transgenic animals and 52.6% in
electroporation experiments), indicating that CR5 activity is preferentially in a subpopulation
of Sox5+ progenitors. The difference in the percentage of CR5-GFP+/Sox5+ cells observed
in the two experiments may be due to the fact that, in the electroporation experiment, the CR5-GFP construct targets only the SVZ
progenitors at P0 and results in transient episomal GFP
expression, whereas in transgenic animals the CR5-GFP construct is introduced into the embryo
and integrated into the genome, resulting in spatiotemporally regulated
GFP expression. We noticed that there was also a large difference in the percentages of
CR5-GFP+/NG2+ cells, i.e., 19.8% in transgenic animals and 68.9% in electroporation
experiments. This can again be explained by the two methods targeting
different cell populations, i.e., embryonic neural stem cells in the transgenic mouse study vs SVZ
progenitors at the neonatal stage (P0) in the electroporation experiments.

It is interesting that CR5-GFP+ cells were not detected in the postnatal SVZ of the
transgenic animals (Figs. 1-3), indicating that CR5 may not be active in the SVZ region of
the cerebral cortex. Alternatively, it is also possible that the CR5-GFP level was too low to
be detected. If CR5 is indeed not active in the cerebral cortex, the electroporation-derived
GFP+ cells may be a result of the electrical stimulation of the SVZ progenitors. The
transgenic experiments reveal the spatiotemporal activity of CR5 during CNS development,
while the in vivo electroporation experiments provide a rapid screening method for functional
cis-elements. Our results from the CR5-GFP transgenic mouse study not only provide
evidence to support the conclusion that CR5 activity is in a subpopulation of Sox5+
progenitors but also show that this progenitor population is located in the
neonatal ventral forebrain (Figs. 3 and 4). Thus, CR5 activity is spatiotemporally
restricted.

Sox5 is well known for its function in controlling cell cycle and sequential generation of 
distinct corticofugal neuron subtypes. Our analysis revealed a novel role of Sox5 in neural 
progenitors of the neonatal ventral forebrain regulated by CR5 activity (Fig. 3). To support
this, transient Sox5 expression was detected in the P0 ventral forebrain (Fig. 4). By P7,
Sox5 expression was dramatically decreased and barely detectable (Fig. 4). The observation
that no CR5-GFP+ cells were detected in the transgenic neocortex where Sox5 is highly 
expressed implies that CR5 was not active in this brain region. Thus, Sox5 expression in 
different brain areas may be regulated by different mechanisms. 

Despite its location upstream of the Olig2 gene, CR5 may not directly regulate Olig2
expression, as the majority of CR5-GFP+ cells were not co-localized with Olig2 or the
oligodendrocyte progenitor marker PDGFRα (Figs. 2-3 and Fig. S3). Thus, CR5 activity
may not be in the oligodendrocyte lineage. Since CR5-GFP+ cells were co-localized with
Sox5 and not with Olig2, Sox5 may not directly regulate Olig2 expression. This is
consistent with the role of Sox5 in suppressing myelin gene expression in oligodendrocytes
(Stolt et al., 2006). Sox5 expression is found in VZ cells, astroglia, and specific neuronal
populations (Batista-Brito et al., 2009; Lai et al., 2008) and not in differentiating
oligodendrocytes (Stolt et al., 2006).

The diversity of neuronal progeny in the early postnatal brain contributes to the anatomical 
organization and cell specification (Lledo et al., 2008). Recent studies have revealed that 
distinct molecules mobilize stem cells toward neurogenesis in different regions (Lopez-Juarez
et al., 2013). Many molecules exert regulatory functions in a region-specific manner
(Brill et al., 2008; Lledo et al., 2008; Lopez-Juarez et al., 2013; Merkle et al., 2007). Given
the regionally restricted expression of CR5-GFP, our results suggest that CR5 is an
important regulatory element for proper lineage progression of the Sox5+/NG2+ subpopulation.
Previous studies demonstrated that Olig2 is not co-expressed with NG2 in the ventral
forebrain; NG2+/Olig2- cells in this area differentiated into astrocytes, but not
oligodendrocytes (Zhu et al., 2012). The presence of a subpopulation of CR5-GFP+/NG2+ cells in the transgenic
ventral brain indicates that CR5 might play a role in controlling cell diversity in this brain
region. In addition, a recent study suggests that Gsx1 is likely to be a regulator in the
development of lateral and ventral neural stem cells (Lopez-Juarez et al., 2013). Our finding
that a subpopulation of CR5-GFP+ cells in transgenic animals were co-localized with Gsx1
(Fig. 3F) further supports the notion that CR5 is involved in the generation of cell diversity
in the forebrain.

The gene regulatory ability of CR5 is attributed to its binding to specific protein 
factors. We identified Sox5 as a CR5-binding trans-acting factor, which was confirmed by 
EMSA and ChIP (Fig. 5), and by site-directed mutagenesis and in vivo electroporation assays 
(Fig. 6). Although no obvious change in the level of GFP expression was detected with 
the mutant CR5-GFP construct carrying a mutated Sox5 binding site, the mutation did alter the 
composition of the resulting GFP+ cell population: a dramatic decrease in the percentage 
of mutant CR5-GFP+ cells co-labeled with Sox5 and NG2, and a significant increase in the 
percentage of mutant CR5-GFP+ cells co-labeled with BLBP (Fig. 6). These findings suggest 
that although the Sox5 binding site is not required for the activation of CR5, it does affect the 
specificity of CR5 for the Sox5+ cell population. 

Sox5 belongs to the SoxD group, which contains two other factors, Sox6 and Sox13 (Guth and 
Wegner, 2008). Our analysis showed that CR5 contains binding sites for Sox5, but not for 
Sox6 or Sox13. For this reason, the role of Sox6 or Sox13 in regulating CR5 activity was not 
examined. However, the possibility that Sox6 and/or Sox13 may interact with Sox5 and 
participate in CR5 activation cannot be ruled out. 

A major challenge in understanding how various cell types in the CNS are generated is to 
elucidate the mechanism of cell-specific gene expression. The identification of the novel 
gene regulatory element in this study represents one step in this effort. Based on our 
findings, we propose that the cis-element CR5 and its binding factors participate in the 
regulation of cell-specific gene expression in a spatiotemporal manner. Our study provides 
an example of such cell-specific gene expression mediated by a direct interaction of 
trans-acting factors (e.g., Sox5 and possibly Phox2a) with a cis-element (e.g., CR5). This study 
also represents a useful and effective method for the functional study of non-coding 
regulatory sequences and their binding protein factors in cell lineage development. 
Fig. 1. 

CR5 exhibits activity in multipotent cells at P0, but not during embryonic stages. Gene 
regulatory activity of CR5 was tested during both embryonic and postnatal forebrain 
development. For postnatal experiments, the P0 SVZ was injected and electroporated with the 
experimental construct CR5-GFP or the control construct CAG-GFP individually. Brain 
tissues were harvested at P7, seven days after electroporation. Reporter expression was 
examined on sagittal sections. CR5-GFP+ cells were detected in the SVZ and cc areas of 
the postnatal forebrain (A). The control CAG-GFP+ cells were found in the SVZ, cc, and Cx 
areas and showed the morphology of various cell types (B). Higher magnifications of the boxed 
areas in (A and B) are shown in (A1 and B1), and cells with specific morphology are 
indicated by arrowheads. (C) Diagrams of a mouse brain in the sagittal plane depicting cells 
targeted by IVE at P0 and detection of reporter gene expression at P7. The location of A and B 
is indicated in C by a blue box. For embryonic experiments, the developing VZ was 
co-electroporated with the experimental construct CR5-GFP and the control construct 
CAG-DsRed at E15.5. Brain tissues were harvested at E17.5, two days after electroporation. 
Reporter expression (GFP and DsRed) was examined on sagittal sections of the embryonic 
forebrain at E17.5. No CR5-GFP expression was detected (D and E), while a large number of 
CAG-DsRed+ cells were observed in all transfected brains (red cells in D). (F) Diagrams of 
a mouse brain in the sagittal plane depicting cells targeted by IUE at E15.5 and detection of 
reporter gene expression at E17.5. The blue boxed area in F is the location of D and E. Cx, 
cortex; cc, corpus callosum; LV, lateral ventricle; VZ, ventricular zone; SVZ, subventricular 
zone; IUE, in utero electroporation; IVE, in vivo electroporation. Scale bar = 50 μm. 
Fig. 2. 

CR5 activity is preferentially in the Sox5+ and NG2+ progenitors. Sagittal sections of the P7 
mouse forebrain were immunostained with anti-GFP antibody (green) and specific cell 
markers (red). GFP+ cells observed in the SVZ, corpus callosum, and neocortical areas after 
in vivo electroporation at P0 were further examined for the expression pattern of progenitor 
markers including Sox5 (A and F), NG2 (B and G), and Olig2 (C and H); the radial glia marker 
BLBP (D and I); and the type C cell marker Mash1 (E and J). Boxed areas are shown in the green 
and red channels separately. Double-labeled cells are indicated by arrowheads. Cells that 
express GFP but are not labeled with a cell marker are indicated by arrows. (K) A histogram 
showing the percentage of double-labeled cells (GFP+/cell marker+) over GFP+ cells. Error 
bars indicate the standard deviation. The histogram also includes data from 
immunostaining with the astrocyte markers GFAP and S100β and the neuronal marker NeuN (see 
Fig. S3). Compared with the control group, a higher percentage of the CR5-GFP+ cells were 
co-stained with Sox5 and NG2. No significant difference was observed in S100β, GFAP, 
BLBP, or Mash1 co-labeling. The significance of the difference between the CR5 and control 
groups was assessed by Student's t-test (see Table S1) (**p < 0.01; n=3). Cx, cortex; cc, 
corpus callosum; LV, lateral ventricle. Scale bar = 20 μm. 
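
The quantification described in this legend (percentage of GFP+/marker+ cells over GFP+ cells, compared between groups by Student's t-test with n=3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the cell counts are placeholder values and not data from the study, the function names are hypothetical, and a pooled-variance two-sample test is assumed because the legend does not specify the variant. Significance is judged against the standard two-tailed critical value of t for p < 0.01 with 4 degrees of freedom (4.604).

```python
from math import sqrt
from statistics import mean, stdev


def percent_double_labeled(double_labeled, gfp_total):
    """Percentage of GFP+ cells that are also positive for the marker."""
    return 100.0 * double_labeled / gfp_total


def student_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# PLACEHOLDER counts per animal: (GFP+/marker+ cells, total GFP+ cells).
# These numbers are illustrative only and are NOT taken from the study.
cr5_counts = [(45, 100), (50, 100), (55, 100)]
control_counts = [(10, 100), (15, 100), (20, 100)]

cr5 = [percent_double_labeled(d, g) for d, g in cr5_counts]
control = [percent_double_labeled(d, g) for d, g in control_counts]

t_stat, df = student_t(cr5, control)

# Two-tailed critical value of Student's t for p < 0.01 at df = 4.
T_CRIT_P01_DF4 = 4.604
significant = df == 4 and abs(t_stat) > T_CRIT_P01_DF4

print(f"CR5: {mean(cr5):.1f} +/- {stdev(cr5):.1f} %")
print(f"Control: {mean(control):.1f} +/- {stdev(control):.1f} %")
print(f"t = {t_stat:.3f}, df = {df}, p < 0.01: {significant}")
```

Percentages are computed per animal first, so the standard deviation shown as an error bar reflects variation between animals rather than between individual cells.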
Fig. 3. 

CR5 is preferentially active in Sox5-expressing cells of the ventral forebrain around 
postnatal day 0. (A) Sagittal and coronal brain sections of a P0 transgenic mouse show that 
CR5-GFP+ cells were primarily located in the ventral forebrain. Diagrams of the sagittal and 
coronal planes show the distribution pattern of CR5-GFP+ cells, indicated by green dots. 
Anti-GFP immunostaining of the boxed areas is shown in the images on the 
right. CR5-GFP+ cells in the ventral forebrain from sagittal samples were 
immunostained with the specific cell markers Sox5 (C), NG2 (D), Pax6 (E), Gsx1 (F), and PH3 
(G). (B) A histogram showing the percentage of double-labeled cells (GFP+/cell marker+) over 
GFP+ cells, including other markers, e.g., the neuronal markers Tbr1 and Tbr2; the 
oligodendrocyte markers Olig2 and PDGFRα; the radial glia markers GFAP and BLBP; and the 
astrocyte marker S100β (Fig. S4). Error bars indicate the standard deviation (see 
Table S2). Scale bar = 50 μm. 
Fig. 4. 

Sox5 is expressed in the neonatal ventral forebrain. Sagittal sections from mouse brain at 
P0 (A-C), E17.5 (D and E), and P7 (D and F) were immunostained with anti-Sox5 
antibody. Strong Sox5 expression was detected in both the ventral forebrain (B) and dorsal 
cortex (C) at P0, but was not detected in the E17.5 ventral forebrain (E). Weak Sox5 expression 
was detected in the P7 ventral forebrain (F). Diagrams of the sagittal brain are shown in A and D. 
Dashed boxed areas in the diagrams are shown in the images on the right with corresponding labels. 
The Sox5 expression pattern in the P0 brain is indicated by red dots, while CR5-GFP expression is 
indicated by green dots. VB, ventral forebrain; CX, cortex. Scale bar = 50 μm. 



Dev Biol. Author manuscript; available in PMC 2014 September 19. 



Hao et al. 



Fig. 5. 

CR5 interacts with specific nuclear protein factors. Analysis of homologous CR5 sequences 
from 8 species with the web-based sequence analysis tool WebLogo revealed two highly 
conserved motifs (A and C). Electrophoretic mobility shift assays (EMSA) were performed 
to identify the in vitro binding activity of nuclear proteins with CR5. Among a total 
of seven probes that cover the entire CR5, probe #1 (A and B) and probe #2 (C and D) 
showed binding activity. The competition assay was carried out using unlabeled probes at 
40-fold higher concentration. Nuclear extracts were obtained from developing mouse brain at 
various stages (P0: lanes 2 and 3; P7: lanes 4 and 5; Adult: lanes 6 and 7). For probe 
#1, the arrowheads indicate the retarded band obtained using the P0 brain nuclear extracts 
(lane 2). The shift disappeared when competitor was added (lane 3). No shift was observed 
in the assays using P7 and adult mouse brain nuclear extracts (lanes 4-7). For probe #2, 
retarded bands were observed for all stages (arrowhead, lanes 2, 4, and 6). The shift 
diminished when competitor was added (arrowhead, lanes 3, 5, and 7). Chromatin 
immunoprecipitation (ChIP) assays detected in vivo binding of protein factors (Phox2a, 
Phox2b, and Sox5) to CR5 (E). Chromatin obtained from P0, P7, P14, and P21 brain tissues 
of mouse pups was immunoprecipitated using antibodies against the individual protein factors. 
Pure mouse IgG antibody was used as a negative control. The input represented 1% of 
the total chromatin extract. The precipitated DNA fragments were amplified with a set of 
primers flanking the CR5 sequence. PCR products of the expected size of 416 bp were 
obtained. As a negative control, primers flanking a random sequence of ~700 bp that does 
not include specific binding sites were tested, and no PCR product was detected. 
Fig. 6. 

The Sox5 binding site affects the gene regulatory activity of CR5. Schematic representation of 
the plasmid constructs CR5-GFP (A) and the mutant CR5-GFP with deletion of "AT" from the Sox5 
binding site "CAAT" (B). The mutated Sox5 binding site is indicated by a red X. Sagittal 
sections of the P7 mouse forebrain after in vivo electroporation at P0 showing GFP+ cells in 
brain tissues transfected with the CR5-GFP (A') or mutant CR5-GFP (B') constructs. The location 
of A'-B' is indicated by a blue box in H. GFP+ cells were further examined for the expression 
of the neural stem/progenitor markers Sox5 (C), NG2 (D), Olig2 (E), and BLBP (F). (G) A 
histogram showing the percentage of double-labeled cells (GFP+/cell marker+) over GFP+ 
cells. Error bars indicate the standard deviation. Compared with the 
wild-type CR5 group, a lower percentage of mutant CR5-GFP+ cells express the neural 
stem/progenitor markers Sox5 and NG2, and a higher percentage of mutant CR5-GFP+ cells 
express BLBP. (H) Diagrams of a mouse brain in the sagittal plane depicting cells targeted by 
IVE at P0 and detection of reporter gene expression at P7. The significance of the difference 
was assessed by Student's t-test (**p < 0.01; n=3). Cx, cortex; cc, corpus callosum; HP, 
hippocampus; LV, lateral ventricle. Scale bar = 50 μm. 
Table 2 

A list of EMSA probes for CR5. 

EMSA probe | Forward sequence | Potential binding transcription factors 
Probe #1 | GCCCTGGGACCCCCACCAATAAATTATGGGTGGACATTAGGGGAGAGCCCAGGA | WT1, PHOX2a, ISL2, PAX6, PIT1, EN1, PHOX2 
Probe #2 | CGTGAGGGCAGCCTGCATTGTAAATTACAATTAAAACAGAAACAGACAGTTCCT | SOX5, HOXB6, MEOX1, NKX25, EN1, HOXB8 
Probe #3 | ATGACCAAGATGGGGACATTGTGTTTACCTACTTGAG | (none listed) 
Probe #4 | ACCTACTTGAGCAGAGGAGAAGGTGACCGTGAGGGCAGCCT | (none listed) 
Probe #5 | TCAGATAATCGCCTCCCTCCCGGCTGTCAGGGGTGCAGCCACTGCCAAT | (none listed) 
Probe #6 | CAGCCACTGCCAATTCACAGCGCCCTCCGAGAAAGTACCCTTGTCTGTGAT | (none listed) 
Probe #7 | GATGGCACACTCCATTTGATAATGGCTCTCATCTGCCTCAGATAATCGCC | (none listed) 








